9.5K Publications
562.3K Citations
17.2K Authors
3.2K Institutions
Integrated Prosodic Coding
1959 - 1965
During 1959–1965, research converged on treating prosody as an integrated, multi-channel code in which fundamental frequency, amplitude, and duration jointly signal intonation and stress across languages. Onset timing and voicing contrasts emerged as central cues for differentiating speech events and articulatory actions, shaping experimental designs and early perceptual studies. Measurement innovations such as cinefluorography and electromyography began to connect articulation with acoustic output, while binaural psychoacoustics offered a practical framework for understanding perceptual masking and separability. Physiological state, including aging and laryngeal factors, modulated measures of pitch and voice quality.
Historical Significance: These developments established core prosodic cues and cross-language voicing patterns that underpinned later theories of perception, speech processing, and prosodic rhythm. Methods and measurement tools linking articulation to acoustics produced enduring datasets and methodologies that influenced multimodal phonetics and intelligibility research. Foundational studies on vowel duration, word stress cues, and cross-language voicing provided benchmarks for later research in speech synthesis, language processing, and hearing sciences.
• Prosody and stress patterns emerge as a multi-channel acoustic code: F0, amplitude, and duration jointly signal intonation and stress in both English and cross-linguistic contexts, suggesting integrated perceptual cues [1][2][19][14].
• Temporal dynamics and onset timing act as central cues for distinguishing speech events and articulatory actions, as shown by onset-time discrimination, voicing contrasts in initial stops, and intervocalic timing patterns across contexts [3][6][7].
• Measurement technologies bridge articulation and acoustics: cinefluorography visualizes tongue/mouth movements, electromyography records muscle activity, and measurements of velopharyngeal closure link articulatory gestures to vowel quality [5][8][12].
• Binaural psychoacoustics and masking-level differences become a guiding framework, combining equalization/cancellation models with empirical data to explain perceptual separability and masking [18][20].
• Phonation, laryngeal correlates, and aging shape acoustic measures of pitch, periodicity, and voice quality, illustrating how physiological state modulates acoustic patterns and perceptual judgments [10][15][11].
Parametric Speech Modeling
1966 - 1972
Dynamic Formant-Transition Perception
1973 - 1979
Gestural Articulatory Phonology
1980 - 1987
Contextual Acoustic Modeling
1988 - 1994
Spectrotemporal Phonetics and Statistical Learning
1995 - 2001
Contextual Hidden Markov Models
2002 - 2008
Adaptive Speech Perception
2009 - 2016
Neural Speech Systems
2017 - 2023